Abstract

Generalizations of linear numeration systems in which $\mathbb{N}$ is recognizable by
finite automata are obtained by describing an arbitrary infinite regular language
following the lexicographic ordering. For these systems of numeration, we show
that ultimately periodic sets are recognizable. We also study the translation and
the multiplication by constants as well as the order-dependence of the
recognizability.

1 Introduction

A series of recent papers is devoted to numeration systems (see, among others, [9], [12],
[14], [19]). They are mainly concerned with the study of the so-called recognizable sets of
integers. Roughly speaking, a set of integers is recognizable if the representations of its
elements have a very simple syntax, i.e. if they form a regular language.

A usual way of representing integers, leading to the so-called linear representation
systems, is to consider a strictly increasing sequence $(U_n)_{n \in \mathbb{N}}$ of integers and to use some
algorithm (such as the greedy algorithm) to represent each natural number $x$ by a word
$c_0 \cdots c_n$ such that $c_0 U_n + \cdots + c_n U_0 = x$ [?]. For example, with $U_n = p^n$ and the greedy
algorithm, one gets the standard numeration system with base $p$.

Among the sets of integers possibly recognizable, $\mathbb{N}$ is of special interest. For
instance, if it is recognizable, then one can easily check whether a word over the alphabet
of the digits represents an integer or not. Under quite general assumptions, it is shown
in [?] that for $\mathbb{N}$ to be recognizable, it is necessary that $U_n$ satisfies a linear recurrence
relation. The sufficient condition given in [14] is that $U_n$ satisfies an extended beta
polynomial for the dominant root $\beta > 1$ of the recurrence. Examples of such systems are the
numeration systems defined by a recurrence relation whose characteristic polynomial is
the minimal polynomial of a Pisot number (like the standard numeration systems or
the Fibonacci system [?]).

A nice description of the recognizable sets has been obtained for the latter [?]. They
are the sets of integers that can be defined in the Presburger arithmetic extended by
some predicate related to the considered Pisot number. In particular, various operations
preserve recognizability, such as addition, translation and multiplication by a
constant.

In [?], Cobham shows that the only sets that are simultaneously recognizable with
respect to two standard numeration systems having multiplicatively independent integer
bases are precisely the finite unions of arithmetic progressions. This remarkable result
has been extended to non-standard linear systems [?], [13], [18], the most general version
being obtained quite recently in [?].

In the above-mentioned results, a property of the considered systems seems to play
a crucial role: the representation $x \in \mathbb{N} \mapsto r(x) \in \{\text{digits}\}^*$ is increasing with respect
to the lexicographic ordering (this is an assumption in [19]; it is a consequence of the
greedy algorithm). Observe that a numeration system having this property is completely
determined by the language $r(\mathbb{N})$ and the ordering of the digits, the sequence $U_n$ and
the algorithm defining the system being just extra data devised to compute the function
$r$ in some "practical" fashion.

Taking this into account, we thus define an (abstract) numeration system as being
a triple $S = (L, \Sigma, <)$ where $L$ is an infinite language over the totally ordered alphabet
$(\Sigma, <)$. Enumerating the elements of $L$ lexicographically with respect to $<$ leads to a
one-to-one map $r_S$ from $\mathbb{N}$ onto $L$. To any natural number $n$, it assigns the $(n+1)$th
word of $L$, its $S$-representation, while the reciprocal map $\mathrm{val}_S$ sends any word belonging
to $L$ onto its numerical value. A subset $X \subseteq \mathbb{N}$ is said to be $S$-recognizable if $r_S(X)$ is
a regular subset of $L$.

Having in mind a possible generalization of Cobham's theorem, it is natural to
check whether the ultimately periodic subsets of $\mathbb{N}$ are $S$-recognizable. Of course, if
they are, then $L = r_S(\mathbb{N})$ is regular. It is a quite remarkable fact that conversely, if $L$
is regular, then every arithmetic progression is indeed $S$-recognizable (a special case of
this result has been obtained separately in [?]).

As recalled above, the recognizability of $\mathbb{N}$ is an important property that is often
required. Unless otherwise stated, we assume in the sequel that $L$ is a regular language.
Under this assumption, we obtain algorithms to compute $r_S$ and $\mathrm{val}_S$. The first is a
generalization of the greedy algorithm involving the complexity functions of the states
$w^{-1}.L$, $w \in \Sigma^*$, of the minimal automaton of $L$ in place of the sequence $U_n$ (for more
about minimal automata, see for instance [?]). Both proved to be quite useful in
many concrete experiments.

In a positional numeration system, each digit has its own weight so that the question
of changing the order of the digits is somewhat irrelevant in this case. In an abstract
numeration system, the letters have no a priori individual role and, as we show with
the help of the language $\{a,b\}^* \setminus a^*b^*$, the family of recognizable sets depends on the
ordering of the alphabet. However, we exhibit two classes of regular languages for which
the recognizability of a set of integers is independent of the order on the alphabet. One of
these classes is the set of the slender languages [?]. The other is the set of the languages
$L \subseteq \Sigma^*$ for which the complexity functions of the associated languages $w^{-1}.L$ differ only
at finitely many places.

As for the stability of the recognizability under natural arithmetic operations, we
show that for each $t$, a subset $X$ of $\mathbb{N}$ is $S$-recognizable if and only if $X + t$ is
$S$-recognizable. On the other hand, multiplication by a constant generally does not preserve
recognizability, so that addition is not a regular map either. For example, in the
numeration system $S$ based on the language $a^*b^*$, the set of $t \in \mathbb{N}$ for which $tX$ is
$S$-recognizable whenever $X$ is $S$-recognizable consists of the perfect squares. Note that in this case,
the function $\mathrm{val}_S$ is nothing else but the well-known Peano function [21]; surprisingly,
the proof of the result is difficult and it relies partly on the properties of Pell's
equation [?], [20].



2 Basic definitions and notations 

In this paper, if $\Sigma$ is a finite alphabet then $\Sigma^*$ is the free monoid (with identity $\varepsilon$)
generated by $\Sigma$. For a set $S$, $\#S$ denotes the cardinality of $S$ and for a string $w \in \Sigma^*$,
$|w|$ denotes the length of $w$.

Let $L \subseteq \Sigma^*$ be a regular language. We denote by $M_L = (K, s, F, \delta, \Sigma)$ the minimal
automaton of $L$ where $K$ is the set of states, $s$ is the initial state, $F$ is the set of final
states and $\delta : K \times \Sigma \to K$ is the transition function. We often write $k.\sigma$ instead of
$\delta(k, \sigma)$.

Recall that the elements of $K$ are the sets $w^{-1}.L = \{v \in \Sigma^* : wv \in L\}$, $w \in \Sigma^*$. The
state $k$ is of the form $w^{-1}.L$ if and only if $k = s.w$, $w^{-1}.L$ being then the set $L_k$ of words
accepted by $M_L$ from $k$. In particular, $L = L_s$.

We denote by $u_l(k)$ the number $\#(L_k \cap \Sigma^l)$ of words of length $l$ belonging to $L_k$ and by
$v_l(k)$ the number of words of length at most $l$ belonging to $L_k$,

$$v_l(k) = \sum_{i=0}^{l} u_i(k).$$

If we are only interested in the number of words belonging to $L$, then we simply write $u_l$
and $v_l$ instead of $u_l(s)$ and $v_l(s)$, provided that it does not lead to any confusion.
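The complexity functions can be tabulated directly from a complete deterministic automaton, since $u_0(k)$ is 1 exactly when $k$ is final and $u_l(k)$ is the sum of the $u_{l-1}(k.\sigma)$ over the letters $\sigma$. The following is only a minimal sketch: the dictionary encoding of the automaton, the state names and the function name `complexity` are our own assumptions, and the example automaton is the one we take for $a^*b^*$.

```python
def complexity(delta, finals, states, max_len):
    """Return u with u[l][k] = number of words of length l accepted from state k."""
    u = [{k: (1 if k in finals else 0) for k in states}]
    for _ in range(max_len):
        prev = u[-1]
        # a word of length l read from k is a letter followed by a word of length l-1
        u.append({k: sum(prev[delta[k][a]] for a in delta[k]) for k in states})
    return u

# Hypothetical encoding of a complete DFA for a*b* ("z" is a sink state).
delta = {"s": {"a": "s", "b": "t"},
         "t": {"a": "z", "b": "t"},
         "z": {"a": "z", "b": "z"}}
u = complexity(delta, {"s", "t"}, delta.keys(), 5)
v = [sum(u[i]["s"] for i in range(l + 1)) for l in range(6)]  # v_l(s)

print([u[l]["s"] for l in range(6)])  # [1, 2, 3, 4, 5, 6]
print(v)                              # [1, 3, 6, 10, 15, 21]
```

For this language there are $l + 1$ words of each length $l$, which is what the first printed row shows.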

Definition 1 A numeration system $S$ is a triple $(L, \Sigma, <)$ where $L$ is an infinite regular
language over the totally ordered alphabet $(\Sigma, <)$.

For each $n \in \mathbb{N}$, $r_S(n)$ denotes the $(n+1)$th word of $L$ with respect to the lexicographic
ordering and is called the $S$-representation of $n$.

Remark that the map $r_S : \mathbb{N} \to L$ is an increasing bijection. For $w \in L$, we set
$\mathrm{val}_S(w) = r_S^{-1}(w)$. We call $\mathrm{val}_S(w)$ the numerical value of $w$.
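To make the definition concrete, the maps $r_S$ and $\mathrm{val}_S$ can be tabulated by brute force on an initial segment. This is a sketch under assumptions: we read the "lexicographic" enumeration as shortest words first, ties broken alphabetically (which is what the later formula $|r_S(n)| = \inf\{m : n < v_m\}$ suggests), the language is given as a Python regular expression, and the helper name `words_in_order` is ours.

```python
from itertools import product
import re

def words_in_order(pattern, alphabet, max_len):
    """Words matching `pattern`, shortest first, ties broken alphabetically
    according to the order in which the letters appear in `alphabet`."""
    regex = re.compile(pattern)
    result = []
    for length in range(max_len + 1):
        # product() yields tuples in the order induced by `alphabet`
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if regex.fullmatch(word):
                result.append(word)
    return result

# S = (a*b*, {a, b}, a < b)
words = words_in_order(r"a*b*", "ab", 4)
rep = {n: w for n, w in enumerate(words)}   # r_S on an initial segment of N
val = {w: n for n, w in enumerate(words)}   # val_S on the same segment

print(words[:6])  # ['', 'a', 'b', 'aa', 'ab', 'bb']
```

With this reading, the empty word represents 0 and `ab` represents 4.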

Definition 2 Let $S$ be a numeration system. A subset $X$ of $\mathbb{N}$ is $S$-recognizable if $r_S(X)$
is recognizable by finite automata.

Let $S = (L, \Sigma, <)$ be a numeration system. Each $k \in K$ for which $L_k$ is infinite leads
to the numeration system $S_k = (L_k, \Sigma, <)$. The applications $r_{S_k}$ and $\mathrm{val}_{S_k}$ are simply
denoted $r_k$ and $\mathrm{val}_k$ if the context is clear. If $L_k$ is finite, the applications $r_k$ and $\mathrm{val}_k$ are
defined as in the infinite case but the domain of the former restricts to $\{0, \ldots, \#L_k - 1\}$.

3 Computation of $\mathrm{val}_S$ and recognizability of ultimately periodic sets

In this section, given any numeration system $S = (L, \Sigma, <)$, we indicate how to compute
the function $\mathrm{val}_S$ and show that the arithmetic progressions $p + \mathbb{N}q$ are $S$-recognizable.



We first need a lemma. 



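Lemma 3 Let $S = (L, \Sigma, <)$ be a numeration system. If $\alpha\beta$ belongs to $L_k$, $\alpha, \beta \in \Sigma^+$,
then

$$\mathrm{val}_k(\alpha\beta) = \mathrm{val}_{k.\alpha}(\beta) + v_{|\alpha\beta|-1}(k) - v_{|\beta|-1}(k.\alpha) + \sum_{\alpha' < \alpha,\ |\alpha'| = |\alpha|} u_{|\beta|}(k.\alpha').$$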

Proof. We have to compute the number of words belonging to $L_k$ and lexicographically
strictly less than $\alpha\beta$. There are three kinds of such words. The first consists of words
of length strictly less than $|\alpha\beta|$ and counts $v_{|\alpha\beta|-1}(k)$ elements. The next one consists
of words of length $|\alpha\beta|$ admitting the prefix $\alpha$. Since a word $\alpha\beta'$ belongs to $L_k$ if and
only if $\beta'$ belongs to $L_{k.\alpha}$, we see that there are $\mathrm{val}_{k.\alpha}(\beta) - v_{|\beta|-1}(k.\alpha)$ such words. It is
clear that there are

$$\#\{w \in L_k : w = \alpha'\beta',\ |\alpha'| = |\alpha|,\ |\beta'| = |\beta| \text{ and } \alpha' < \alpha\} = \sum_{\alpha' < \alpha,\ |\alpha'| = |\alpha|} u_{|\beta|}(k.\alpha')$$

words of the last kind. □

Remark 1 Taking a letter for $\alpha$ in Lemma 3, one easily deduces an effective
algorithm to compute $\mathrm{val}_S$.
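The algorithm alluded to in Remark 1 can be sketched as follows: the word is read letter by letter and, at each step, the three counts of Lemma 3 (with $\alpha$ a single letter) are accumulated. The automaton encoding, the function name `val` and the convention $v_{-1} = 0$ used for the last letter are our assumptions; the test automaton is the one we take for $a^*b^*$.

```python
def val(word, delta, finals, order, start="s"):
    """Numerical value of `word` (assumed to be accepted from `start`),
    peeling off one letter at a time as in Lemma 3."""
    n = len(word)
    states = list(delta)
    # u[l][k]: words of length l accepted from k;  v[l][k]: words of length <= l
    u = [{k: int(k in finals) for k in states}]
    for _ in range(n):
        u.append({k: sum(u[-1][delta[k][a]] for a in order) for k in states})
    v = [dict(u[0])]
    for l in range(1, n + 1):
        v.append({k: v[-1][k] + u[l][k] for k in states})

    total, k = 0, start
    for i, a in enumerate(word):
        rest = n - i - 1                      # |beta|, length of the remaining suffix
        nxt = delta[k][a]
        total += v[rest][k]                   # words shorter than a.beta
        if rest > 0:
            total -= v[rest - 1][nxt]         # shorter words counted again in the next step
        total += sum(u[rest][delta[k][b]]     # same length, smaller first letter
                     for b in order[:order.index(a)])
        k = nxt
    return total

# Hypothetical encoding of a complete DFA for a*b* ("z" is a sink state), with a < b.
DELTA = {"s": {"a": "s", "b": "t"},
         "t": {"a": "z", "b": "t"},
         "z": {"a": "z", "b": "z"}}
FINALS = {"s", "t"}

print(val("ab", DELTA, FINALS, "ab"))   # 4
print(val("aab", DELTA, FINALS, "ab"))  # 7
```

The cost is linear in $|w|$ once the tables $u$ and $v$ are available, which is the point of the remark.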

Remark 2 It also follows from Lemma 3 that for each word $w$,

$$\mathrm{val}_S(w) = \sum_{k \in K,\ 0 \le l < |w|} c_{k,l}\, u_l(k)$$

where the "digits" $c_{k,l}$ are less than or equal to $\#\Sigma$.

Theorem 4 Let $S = (L, \Sigma, <)$ be a numeration system and $p$, $q$ two non-negative
integers. The arithmetic progression $p + \mathbb{N}q$ is $S$-recognizable.

Proof. We can assume that $p < q$. We show that the minimal automaton of
$A = r_S(p + \mathbb{N}q)$ is finite. Its states are the sets

$$w^{-1}.A = \{x \in \Sigma^* : \mathrm{val}_S(wx) \equiv p \bmod q\}, \quad w \in \Sigma^*.$$

Observe first that the sequence $v_n(s)$, being a solution of a linear recurrence equation, is
ultimately periodic in $\mathbb{Z}_q$, say of period $t$. By Lemma 3, for $|w|$ large enough, $w^{-1}.A$ is
thus of the form

$$\Big\{x : \mathrm{val}_k(x) + v_{|x|+i}(s) - v_{|x|-1}(k) + \sum_{k' \in K} j_{k'}\, u_{|x|}(k') \equiv p \bmod q\Big\}$$

for some $k \in K$, $j_{k'} \in \{0, \ldots, q-1\}$ and $i \in \{0, \ldots, t-1\}$. Only finitely many such sets exist. □
4 Computation of $r_S$ and reordering of the alphabet

We now explain how to compute $r_S$ effectively and discuss to what extent the
$S$-recognizable subsets of $\mathbb{N}$ depend on the ordering of the alphabet.

Let $S = (L, \Sigma, <)$ be a numeration system, where $\Sigma = \{\sigma_1 < \cdots < \sigma_p\}$.

It is clear that

$$|r_S(n)| = \inf\{m \mid n < v_m\}.$$

Set $|r_S(n)| = l$; then $n - v_{l-1}$ is the number of words of length $l$ belonging to $L$ and
strictly less than $r_S(n)$.

To determine the first letter of the representation, we have to compute the number
$N_t^l$ of words of length $l$ belonging to $L$ and beginning with $\sigma_1$ or ... or $\sigma_t$ ($t \le p$),

$$N_t^l = \sum_{i=1}^{t} u_{l-1}(\sigma_i^{-1}.L).$$

If $N_{t-1}^l \le n - v_{l-1} < N_t^l$ then the first letter is $\sigma_t$. We proceed in the same way to
find the other letters of the representation. Recall that if $k$ is a state of $M_L$ then
$\delta(k, \sigma_j) = \sigma_j^{-1}.k$. Hence the following algorithm, which computes the $S$-representation $w$
of a given integer $n$.

Algorithm 1 Let $l$ be such that $v_{l-1} \le n < v_l$.

    k ← s
    m ← n − v_{l−1}
    w ← ε
    for i ranging from 1 to l do
        j ← 1
        while m ≥ u_{l−i}(δ(k, σ_j)) do
            m ← m − u_{l−i}(δ(k, σ_j))
            j ← j + 1
        k ← δ(k, σ_j)
        w ← w σ_j
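Algorithm 1 translates almost line by line into executable form. The sketch below makes a few assumptions of ours: the automaton is a dictionary of dictionaries, letters are indexed from 0 instead of 1, the counter `m` is the number of words of length $l$ strictly smaller than the representation (so the inner test is "greater than or equal"), and the language accepted from `start` is infinite, otherwise the first loop would not terminate for large `n`.

```python
def rep(n, delta, finals, order, start="s"):
    """S-representation of n, following Algorithm 1."""
    states = list(delta)
    u = [{k: int(k in finals) for k in states}]    # u[l][k]
    v_below, l = 0, 0                              # v_below plays the role of v_{l-1}
    while n >= v_below + u[l][start]:              # stop when v_{l-1} <= n < v_l
        v_below += u[l][start]
        l += 1
        u.append({k: sum(u[-1][delta[k][a]] for a in order) for k in states})
    m, k, w = n - v_below, start, ""
    for i in range(1, l + 1):
        j = 0
        # skip the letters whose whole block of continuations lies before the target
        while m >= u[l - i][delta[k][order[j]]]:
            m -= u[l - i][delta[k][order[j]]]
            j += 1
        k = delta[k][order[j]]
        w += order[j]
    return w

# Hypothetical encoding of a complete DFA for a*b* ("z" is a sink state), with a < b.
DELTA = {"s": {"a": "s", "b": "t"},
         "t": {"a": "z", "b": "t"},
         "z": {"a": "z", "b": "z"}}
FINALS = {"s", "t"}

print([rep(n, DELTA, FINALS, "ab") for n in range(6)])
# ['', 'a', 'b', 'aa', 'ab', 'bb']
```

Each output letter costs at most $\#\Sigma$ table look-ups, in line with the complexity estimate of Remark 3.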



Remark 3 If $\lim(\cdots) = \theta < \infty$ then the time complexity of the algorithm is

$$O((\#\Sigma)\log_\theta n).$$

As an easy application of Algorithm 1, we obtain a first class of numeration systems
in which the recognizable sets are independent of the order of the alphabet.

It is convenient to introduce notations for the change of numeration systems. Given
systems $S = (L, \Sigma, <)$ and $T = (L', \Sigma', \prec)$, we set

$$\Theta_{S,T} = r_T \circ \mathrm{val}_S : L \to L' \quad\text{and}\quad \Theta'_{S,T} = \mathrm{val}_T \circ r_S : \mathbb{N} \to \mathbb{N}.$$

If the underlying $S$ and $T$ are known from the context, we simply write $\Theta$ and $\Theta'$.

Proposition 5 Let $S = (L, \Sigma, <)$ and $T = (L, \Sigma, \prec)$ be two numeration systems. Let
$n_0$ be a non-negative integer. If for all states $k$ and $k'$ of $M_L = (K, s, F, \delta, \Sigma)$,

$$u_n(k) = u_n(k'), \quad \forall n \ge n_0,$$

then $X \subseteq \mathbb{N}$ is $S$-recognizable if and only if $X$ is $T$-recognizable.
Proof. Assume that $\Sigma = \{\sigma_1 < \cdots < \sigma_p\} = \{\sigma_{\nu_1} \prec \cdots \prec \sigma_{\nu_p}\}$ where $\nu$ is a permutation
of $\{1, \ldots, p\}$. We prove that the graph $\hat\Theta = \{(x, y) \in L \times L : \mathrm{val}_S(x) = \mathrm{val}_T(y)\}$
of $\Theta$ is regular over the alphabet $\Sigma \times \Sigma$, showing that $r_T(X) = p_2(\hat\Theta \cap p_1^{-1}(r_S(X)))$ is
regular if and only if $r_S(X)$ is regular, where $p_1, p_2 : (\Sigma \times \Sigma)^* \to \Sigma^*$ are the canonical
homomorphisms of projection.

Let $(x, y)$ belong to $\hat\Theta$. The two systems $S$ and $T$ have the same sequence $(v_n)_{n \in \mathbb{N}}$,
thus $|x| = |y|$.

By Algorithm 1, if $|x| \ge n_0$ then

$$x = \alpha\beta \quad\text{and}\quad y = \alpha'\beta', \qquad \alpha = \sigma_{i_1} \cdots \sigma_{i_l}, \quad \alpha' = \sigma_{\nu_{i_1}} \cdots \sigma_{\nu_{i_l}},$$

where $|\beta| = |\beta'| = n_0$, $\beta \in L_{s.\alpha}$, $\beta' \in L_{s.\alpha'}$ and

$$\mathrm{val}_{S_{s.\alpha}}(\beta) = \mathrm{val}_{T_{s.\alpha'}}(\beta').$$

To conclude, it is then sufficient to observe that the words of $\hat\Theta$ of length at least $n_0$
are exactly the words accepted by the following nondeterministic finite automaton. The
set of states is $(K \times K) \cup \{f\}$. The initial state is $(s, s)$. The new symbol $f$ denotes
the unique final state. According to what precedes, there are two kinds of transitions.
First, those of label $(\sigma_i, \sigma_{\nu_i})$ mapping the state $(k, k')$ onto $(k.\sigma_i, k'.\sigma_{\nu_i})$. Second, those of
label $(\beta, \beta')$ mapping $(k, k')$ onto $f$, provided that $|\beta| = |\beta'| = n_0$, $\beta \in L_k$, $\beta' \in L_{k'}$ and
$\mathrm{val}_{S_k}(\beta) = \mathrm{val}_{T_{k'}}(\beta')$. □

Example 1 The language over the alphabet $\{a, b\}$ consisting of the words containing an
even number of $a$'s satisfies the hypothesis of Proposition 5.
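The hypothesis can be checked numerically on this example: the minimal automaton has two states and, from length 1 onwards, both accept the same number of words. The state names `e` (even number of $a$'s read so far) and `o` (odd) are our own labels.

```python
# Hypothetical encoding of the two-state automaton for "even number of a's".
delta = {"e": {"a": "o", "b": "e"},
         "o": {"a": "e", "b": "o"}}
finals = {"e"}

u = {k: int(k in finals) for k in delta}           # u_0
history = [u]
for _ in range(6):
    # u_n(k) is the sum of u_{n-1} over the successors of k
    u = {k: sum(u[delta[k][a]] for a in "ab") for k in delta}
    history.append(u)

for n, row in enumerate(history):
    print(n, row)
```

The two columns differ only at $n = 0$, so the hypothesis holds with $n_0 = 1$, both counts being $2^{n-1}$ afterwards.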

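In the next proposition, we give equivalent formulations of the assumption of
Proposition 5. They are expressed in terms of the incidence matrix $A_L$ of the minimal automaton
$M_L$ of $L$. Recall that it is the matrix defined by

$$(A_L)_{ij} = \sum_{t=1}^{p} \delta_{k_i.\sigma_t,\, k_j},$$

where the $\sigma_t$'s and the $k_i$'s denote the $p$ letters and the $\kappa$ states of $M_L$ respectively, and
where $\delta_{k,k'}$ equals 1 if $k = k'$ and 0 otherwise.
We denote by $f$ the characteristic vector of the set of final states:

$$f_i = \begin{cases} 1 & \text{if } k_i \in F \\ 0 & \text{otherwise.} \end{cases}$$

Observe that

$$(A_L^n f)_i = \sum_{j=1}^{\kappa} (A_L^n)_{ij} f_j = u_n(k_i). \tag{1}$$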



Proposition 6 Let $L$ be a regular language over an alphabet $\Sigma$ and $M_L = (K, s, F, \delta, \Sigma)$
be its minimal automaton. Let $m$ be the multiplicity of 0 as a root of the minimal
polynomial of $A_L$. Let $r \ge m$. The next assertions are equivalent:

1. $\forall n \ge r$, $\forall k, k' \in K$, $u_n(k) = u_n(k')$,

2. $\forall n \ge m$, $\forall k, k' \in K$, $u_n(k) = u_n(k')$,

3. $\exists \lambda \in \mathbb{N} : A_L^m f = \lambda v$, with $v = (1, \ldots, 1)^{\sim}$.

In particular, $\forall k \in K$, $\forall i \ge 0$, $u_{m+i}(k) = (\#\Sigma)^i u_m(k)$.

Proof. This follows immediately from (1) and the well-known fact that any polynomial
that annihilates $A_L$ is the characteristic polynomial of a linear recurrence equation
satisfied by each of the sequences $u_n(k)$. □

Here is another easy characterization of the languages for which the assumption of
Proposition 5 holds true.

Proposition 7 Let $L$ be a regular language over an alphabet $\Sigma$. It satisfies the
hypothesis of Proposition 5 if and only if there exist $n_0, u_0 \in \mathbb{N}$ such that for all $w \in \Sigma^*$,
$\#(w^{-1}.L \cap \Sigma^{n_0}) = u_0$. □

The set of slender languages is the second class of languages for which the recognizable 
sets of integers are independent of the ordering of the alphabet. 

Definition 8 [?] Let $d$ be a positive integer. The language $L$ is said to be $d$-slender if

$$\forall n \ge 0, \quad u_n(s) \le d.$$

$L$ is said to be slender if there exists $d$ such that $L$ is $d$-slender.

Lemma 9 [?] Let $L$ be a regular language over the totally ordered alphabet $(\Sigma, <)$. The
set $I(L, <)$ (resp. $Q(L, <)$) obtained by taking from all the words of $L$ of the same length
only the first (resp. last) one in the lexicographic order is regular. □

Proposition 10 Let $d$ be a positive integer. Let $L$ be a regular $d$-slender language. Let
$S = (L, \Sigma, <)$ and $T = (L, \Sigma, \prec)$ be two numeration systems. If $X \subseteq \mathbb{N}$ is $S$-recognizable
then $X$ is $T$-recognizable.

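Proof. As in the proof of Proposition 5, we show that the graph $\hat\Theta$ of the change of
systems is regular. Using Lemma 9, we define iteratively the regular languages $I_{i,<}$ and
$I_{i,\prec}$ by

$$I_{1,<} = I(L, <), \qquad I_{1,\prec} = I(L, \prec),$$

and, for $i = 2, \ldots, d$,

$$I_{i,<} = I\Big[L \setminus \Big(\bigcup_{j=1}^{i-1} I_{j,<}\Big), <\Big], \qquad I_{i,\prec} = I\Big[L \setminus \Big(\bigcup_{j=1}^{i-1} I_{j,\prec}\Big), \prec\Big].$$

Since for all $x \in L$, $|x| = |\Theta(x)|$, the graph of $\Theta$ is thus given by

$$\hat\Theta = \bigcup_{j=1}^{d} \big[(I_{j,<} \times I_{j,\prec}) \cap (\Sigma \times \Sigma)^*\big]. \qquad □$$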

In spite of the two previous propositions, a change of the ordering of the alphabet
generally does not preserve recognizability, as we shall see with $\Sigma = \{a, b\}$ and
$L = \Sigma^* \setminus a^*b^*$.

Lemma 11 Let $n \in \mathbb{N}$. For $U = (\Sigma^*, \Sigma, a < b)$ and $V = (\Sigma^*, \Sigma, b \prec a)$ one has

$$\Theta'_{U,V}(n) = 3 \cdot 2^l - n - 3,$$

where $l = |r_U(n)|$.

Proof. Observe that since $u_l = 2^l$, if $w_1 < \cdots < w_{2^l}$ are the words of length $l$, then
$w_{2^l} \prec \cdots \prec w_1$. Moreover $2^l - 1 \le n \le 2^{l+1} - 2$. Thus

$$\Theta'(n) = 2^{l+1} - 2 - [n - (2^l - 1)]. \qquad □$$
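The closed form of Lemma 11 is easy to confirm by brute force on short words. In this sketch the helper name `enumerate_all` is ours, and the enumeration is taken shortest words first, then alphabetically for the chosen order of the letters.

```python
from itertools import product

def enumerate_all(order, max_len):
    """All words over `order`, shortest first, then alphabetically w.r.t. `order`."""
    return ["".join(t) for l in range(max_len + 1) for t in product(order, repeat=l)]

U = enumerate_all("ab", 5)                 # a < b
V = enumerate_all("ba", 5)                 # b before a
val_V = {w: i for i, w in enumerate(V)}

def theta_prime(n):
    """val_V(r_U(n)), defined here for n < len(U)."""
    return val_V[U[n]]

for n in range(len(U)):
    l = len(U[n])                          # l = |r_U(n)|
    assert theta_prime(n) == 3 * 2**l - n - 3

print(theta_prime(1))  # 2
```

For instance the word `a` is the second word for $U$ (value 1) but the third one for $V$ (value 2).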



Proposition 12 Let $\Sigma = \{a, b\}$ and $L = \Sigma^* \setminus a^*b^*$. For all $n \ge 2$, if $l = |r_U(n-1)|$,

$$\Theta_{S,T}(bab^n) = aba^{n-l-1}b\,r_U(n-1),$$

where $S = (L, \Sigma, a < b)$, $T = (L, \Sigma, b \prec a)$ and $U = (\Sigma^*, \Sigma, a < b)$. In particular,
$\mathrm{val}_S(bab^2b^*)$ is not $T$-recognizable.

Proof. The minimal automaton $M_L$ of $L$ is given in Figure 1.

Figure 1. The minimal automaton of $\Sigma^* \setminus a^*b^*$.

Therefore $L_p = \Sigma^*$,

$$u_0(s) = u_1(s) = 0, \qquad u_n(s) = 2^n - n - 1, \quad \forall n \ge 2,$$

while $u_n(t) = 2^n - 1$ for all $n \in \mathbb{N}$.

In $L$, there are $v_{n+1}(s)$ words of length at most $n + 1$, $u_{n+1}(s)$ words of length $n + 2$
beginning with $a$ and $u_n(p) - 1$ words of length $n + 2$ beginning with $ba$ and less than $bab^n$. Hence, the number
of words belonging to $L$ and lexicographically less than $bab^n$ is

$$\mathrm{val}_S(bab^n) = \sum_{i=2}^{n+1} (2^i - i - 1) + 2^{n+1} + 2^n - n - 3.$$
Using Lemma 3, we sketch the computation of $\mathrm{val}_T[aba^{n-l-1}b\,r_U(n-1)]$:

$$\begin{aligned}
\mathrm{val}_T[aba^{n-l-1}b\,r_U(n-1)]
&= \mathrm{val}_t[a^{n-l-1}b\,r_U(n-1)] + \sum_{i=2}^{n+1}(2^i - i - 1) + 2^n + n \\
&= \mathrm{val}_p[a^{n-l-2}b\,r_U(n-1)] + \sum_{i=2}^{n+1}(2^i - i - 1) + 2^{n+1} - 1 \\
&= \mathrm{val}_p[b\,r_U(n-1)] + \sum_{i=2}^{n+1}(2^i - i - 1) + 2^{n+1} - 1 + \sum_{i=l+2}^{n-1} 2^i \\
&= \mathrm{val}_p[r_U(n-1)] + \sum_{i=2}^{n+1}(2^i - i - 1) + 2^{n+1} - 1 + \sum_{i=l+2}^{n-1} 2^i + 2^l \\
&= \Theta'(n-1) + \sum_{i=2}^{n+1}(2^i - i - 1) + 2^{n+1} - 1 + 2^n - 3 \cdot 2^l.
\end{aligned}$$

Hence the value of $\Theta_{S,T}(bab^n)$, in view of Lemma 11. Applying the pumping lemma, it
is now straightforward to check that $\mathrm{val}_S(bab^2b^*)$ is not $T$-recognizable. □
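The statement of Proposition 12, as we read it, can be confirmed experimentally for small $n$ by listing the words of $L$ in both orders. The sketch relies on the observation that a word over $\{a, b\}$ lies outside $a^*b^*$ exactly when it contains the factor `ba`; the helper name `language` is ours.

```python
from itertools import product

def language(order, max_len):
    """Words over {a, b} containing the factor 'ba' (the complement of a*b*),
    shortest first, then alphabetically w.r.t. `order`."""
    return ["".join(t) for l in range(max_len + 1)
            for t in product(order, repeat=l) if "ba" in "".join(t)]

S = language("ab", 8)                      # order a < b
T = language("ba", 8)                      # order b before a
U = ["".join(t) for l in range(4) for t in product("ab", repeat=l)]
val_S = {w: i for i, w in enumerate(S)}

for n in range(2, 7):
    r = U[n - 1]                           # r_U(n - 1), of length l = len(r)
    expected = "ab" + "a" * (n - len(r) - 1) + "b" + r
    assert T[val_S["ba" + "b" * n]] == expected

print(T[val_S["babb"]])  # abba
```

For $n = 2$ the word $bab^2$ has value 12 in $S$, and the word of value 12 in $T$ is `abba`.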

5 Translation by a constant 

Here we show that the $S$-recognizability of a set is preserved under translation by a
constant. First we recall some classical results about numeration systems.



Lemma 13 [12] Let $p \in \mathbb{N} \setminus \{0, 1\}$. The normalization function

$$\nu : \{1, \ldots, p\}^* \to \{0, \ldots, p-1\}^*$$

which gives the normalized representation in base $p$ of an integer (the representation
obtained by the greedy algorithm) is a rational function; its graph $\hat\nu$ is recognizable by a
finite letter-to-letter automaton. □

Lemma 14 [?] A subset of $\mathbb{N}$ is recognizable in base $p \ge 2$ if and only if it is definable
in the structure $\langle \mathbb{N}, +, V_p \rangle$, where for $x \ne 0$, $V_p(x)$ is the greatest power of $p$ dividing $x$
while $V_p(0) = 1$. □

Proposition 15 Let $S = (L, \Sigma, <)$ be a numeration system. For each natural number
$t$, $X + t$ is $S$-recognizable if $X \subseteq \mathbb{N}$ is $S$-recognizable.

Proof. Let $\Sigma = \{\sigma_1 < \cdots < \sigma_p\}$ and let the homomorphism $h : \Sigma^* \to \{1, \ldots, p\}^*$
be defined by $h : \sigma_i \mapsto i$. For $x \in \mathbb{N}$, the word $h(r_S(x)) = x_0 \cdots x_l \in \{1, \ldots, p\}^*$ is a
representation in base $p$ of the integer $\pi_p(h(r_S(x))) = x_0 p^l + \cdots + x_l p^0$.

Since $L$ is regular over $\Sigma$, by Lemma 13, $\nu(h(L))$ is regular over $\{0, \ldots, p-1\}$ and
by Lemma 14, the set

$$\mathcal{N} = \pi_p[\nu(h(L))]$$

is definable in $\langle \mathbb{N}, +, V_p \rangle$.

The successor function $S_L : L \to L$ (with respect to the lexicographic order) is then
regular. Indeed, $\mathcal{S} = \pi_p \circ \nu \circ h \circ S_L \circ (\pi_p \circ \nu \circ h)^{-1}$ is the restriction to $\mathcal{N}$ of the function
$x \mapsto y$ defined in $\langle \mathbb{N}, +, V_p \rangle$ by the formula

$$(y \in \mathcal{N}) \wedge (x < y) \wedge (\forall z)\big[(z \in \mathcal{N} \wedge x < z) \to (y \le z)\big].$$

Assume now that $X$ is $S$-recognizable, i.e. that $r_S(X)$ is a regular set. Then
$r_S(X + t) = S_L^t(r_S(X))$ is regular. □
6 Multiplication by a constant

In this section, we show that, in general, the multiplication by a constant does not
preserve the recognizability. To that end, we use the system $S = (a^*b^*, \{a, b\}, a < b)$,
for which it is easy to see that

$$\mathrm{val}_S(a^p b^q) = \frac{1}{2}(p + q)(p + q + 1) + q.$$

Remark 4 Observe that the right-hand side is nothing else but the well-known Peano function
[21].
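The closed form for $\mathrm{val}_S(a^p b^q)$ can be compared with a direct enumeration. In this sketch we use the fact that the words of $a^*b^*$ are exactly the words over $\{a, b\}$ without the factor `ba`, and we list them shortest first, then alphabetically with $a < b$; the function name `val_formula` is ours.

```python
from itertools import product

def val_formula(p, q):
    """Closed form (p + q)(p + q + 1)/2 + q for the value of a^p b^q."""
    return (p + q) * (p + q + 1) // 2 + q

# Words of a*b*, shortest first, then alphabetically with a < b.
words = [w for l in range(7)
         for w in ("".join(t) for t in product("ab", repeat=l))
         if "ba" not in w]

for index, w in enumerate(words):
    # in a word of a*b*, p and q are simply the numbers of a's and b's
    assert index == val_formula(w.count("a"), w.count("b"))

print(val_formula(1, 1), words[4])  # 4 ab
```

The first term counts the words strictly shorter than $a^p b^q$, the second its rank among the words of the same length.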

It would suffice to show that, say, the multiplication by two does not preserve recog- 
nizability but here we are lucky enough to get more. 

Theorem 16 Let $S$ be the numeration system $(a^*b^*, \{a, b\}, a < b)$ and let $\alpha \in \mathbb{N}$. The
multiplication by $\alpha$ transforms the $S$-recognizable sets into $S$-recognizable sets if and only
if $\alpha$ is a perfect square.

Proof. (i) Sketch. If $\alpha$ is not a perfect square, we show that for a suitably chosen $r$,

$$C_r^\alpha = a^r b^* \cap r_S(\alpha\,\mathrm{val}_S(a^*))$$

is infinite while the set of lengths $|C_r^\alpha|$ only contains finite arithmetic progressions, so
that $r_S(\alpha\,\mathrm{val}_S(a^*))$ is not even context-free, thanks to Parikh's theorem [15].

If $\alpha = \beta^2$, $\mathbb{N}^2$ is divided into $\beta + 1$ regions $R_i$ in each of which an explicit formula
for the function $M : (p, q) \mapsto (r, s)$ such that $\alpha\,\mathrm{val}_S(a^p b^q) = \mathrm{val}_S(a^r b^s)$ can be supplied.
These regions come from length considerations: given a word of length $l$ and of numerical
value $x$, there are $\beta + 1$ possible lengths for the word of value $\alpha x$. The fact that the
multiplication by $\alpha$ preserves the regularity of the subsets of $a^*b^*$ then follows from an
easy lemma.

(ii) Case of a non perfect square. Let $\alpha$ be a non perfect square integer. We have

$$l \in |C_r^\alpha| \iff \exists p : \mathrm{val}_S(a^r b^{l-r}) = \alpha\,\mathrm{val}_S(a^p).$$

In other words, $l \in |C_r^\alpha|$ if and only if

$$[2(r + s) + 3]^2 - \alpha(2p + 1)^2 = 8r + 9 - \alpha \tag{2}$$

for some $p$, where $s = l - r$.

To guarantee that $|C_r^\alpha|$ be infinite, we choose $r$ in such a way that

$$X^2 - \alpha Y^2 = 8r + 9 - \alpha \tag{3}$$

has infinitely many solutions with odd components. To that purpose, it suffices to choose
$r$ such that $8r + 9 - \alpha > 0$ and that the equation (3) admits a solution $(x, 1)$ with $x$
odd (cf. Appendix). This can be achieved with $r$ of the form $z^2$. Indeed, the equation
$x^2 - 8z^2 = 9$ has infinitely many solutions given by

$$\begin{pmatrix} x_i \\ z_i \end{pmatrix} = \begin{pmatrix} 3 & 8 \\ 1 & 3 \end{pmatrix}^{i} \begin{pmatrix} 3 \\ 0 \end{pmatrix}, \quad \forall i \in \mathbb{N}.$$

The $x_i$'s are odd. We choose $i$ such that $8z_i^2 + 9 - \alpha > 0$ and take $x = x_i$.
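A family of solutions of $x^2 - 8z^2 = 9$ with odd $x$ can be generated numerically. The recurrence used below is an assumption of ours (the standard one for this Pell-type equation): start from $(x, z) = (3, 0)$ and multiply repeatedly by the unit $3 + \sqrt{8}$, i.e. map $(x, z)$ to $(3x + 8z,\ x + 3z)$; the function name `solutions` is also ours.

```python
def solutions(count):
    """First `count` pairs (x, z) obtained from (3, 0) by (x, z) -> (3x + 8z, x + 3z)."""
    x, z = 3, 0
    out = []
    for _ in range(count):
        out.append((x, z))
        x, z = 3 * x + 8 * z, x + 3 * z
    return out

sols = solutions(6)
for x, z in sols:
    assert x * x - 8 * z * z == 9   # each pair solves the equation
    assert x % 2 == 1               # and x is odd

print(sols[:4])  # [(3, 0), (9, 3), (51, 18), (297, 105)]
```

Taking $r = z^2$ for one of these pairs gives $8r + 9 = x^2$, so $(x, 1)$ solves equation (3) whatever the value of $\alpha$.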

The set of the solutions of (3) with odd components is a finite union of sequences
$(X_n^{(j)}, Y_n^{(j)})_{n \in \mathbb{N}}$, $j = 1, \ldots, m$, such that $X_n^{(j)} \ge C^n$ for some $C > 1$ (cf. Appendix).

We are now in a position to show that $|C_r^\alpha|$ only contains finite arithmetic progressions.
Suppose to the contrary that it contains an infinite progression. Then there exist $\lambda, \mu \in \mathbb{N}$,
$\mu > 0$, and, for each $t \in \mathbb{N}$, indices $n_t \in \mathbb{N}$, $j_t \in \{1, \ldots, m\}$ such that

$$\lambda + \mu t = X_{n_t}^{(j_t)} \ge C^{n_t}.$$

Given $t$, the sequence $n_0, \ldots, n_{mt}$ contains at least $t$ distinct numbers. Therefore

$$\forall t \in \mathbb{N}, \quad \lambda + \mu m t \ge C^t,$$

a contradiction.

(iii) The case of a perfect square. Let $\alpha = \beta^2$ and $\beta$ be an odd integer. The case $\beta$ even
is treated in the same way.

We want to compute $r$, $s$ such that $\alpha\,\mathrm{val}_S(a^p b^q) = \mathrm{val}_S(a^r b^s)$, i.e.

$$[2(r + s) + 3]^2 - \beta^2[2(p + q) + 3]^2 = 8r - 8p\beta^2 - 9(\beta^2 - 1).$$

Let $l = p + q$, $l' = r + s$. Then

$$\alpha l(l + 1) \le 2\alpha\,\mathrm{val}_S(a^p b^q) \le \alpha l(l + 3) \quad\text{and}\quad l'(l' + 1) \le 2\,\mathrm{val}_S(a^r b^s) \le l'(l' + 3).$$

Therefore, $l'(l' + 1) \le \beta^2 l(l + 3)$ and $\beta^2 l(l + 1) \le l'(l' + 3)$. From this, it follows easily
that

$$r + s = \beta(p + q) + \frac{\beta + 2i - 1}{2}$$

and thus

$$\begin{cases}
r = r_i(p, q) := \beta(i + 1)p - \beta(\beta - i - 1)q + \frac{1}{8}[(\beta + 2i + 2)^2 - 9] \\
s = s_i(p, q) := -\beta i p + \beta(\beta - i)q - \frac{1}{8}[(\beta + 2i)^2 - 9] - 1
\end{cases}$$

for some $i \in \{-1, \ldots, \beta - 1\}$. These equations together with the conditions $r, s \ge 0$
define $\beta + 1$ regions $R_i$ which divide $\mathbb{N}^2$.
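The formulas for $r_i$ and $s_i$, as we read them in this proof, can be tested numerically in the smallest odd case $\beta = 3$, i.e. $\alpha = 9$. The sketch inverts $\mathrm{val}_S$ directly and checks that, among the candidates $i = -1, \ldots, \beta - 1$, exactly one yields non-negative exponents and that it is the right pair. The function names are ours, and we only claim the check for $\beta = 3$.

```python
def val(p, q):
    """val_S(a^p b^q) for S = (a*b*, {a, b}, a < b)."""
    return (p + q) * (p + q + 1) // 2 + q

def unval(x):
    """Exponents (r, s) of the word a^r b^s whose numerical value is x."""
    l = 0
    while (l + 1) * (l + 2) // 2 <= x:     # find the length l of the word
        l += 1
    s = x - l * (l + 1) // 2               # rank among the words of length l
    return l - s, s

def region_images(beta, p, q):
    """Triples (i, r_i, s_i) with r_i, s_i >= 0, for i = -1, ..., beta - 1 (beta odd)."""
    out = []
    for i in range(-1, beta):
        r = (beta * (i + 1) * p - beta * (beta - i - 1) * q
             + ((beta + 2 * i + 2) ** 2 - 9) // 8)
        s = (-beta * i * p + beta * (beta - i) * q
             - ((beta + 2 * i) ** 2 - 9) // 8 - 1)
        if r >= 0 and s >= 0:
            out.append((i, r, s))
    return out

beta = 3
for p in range(12):
    for q in range(12):
        target = unval(beta * beta * val(p, q))
        assert [(r, s) for _, r, s in region_images(beta, p, q)] == [target]

print(region_images(3, 0, 1))  # [(1, 2, 3)]
```

For example $\mathrm{val}_S(b) = 2$ and $9 \cdot 2 = 18 = \mathrm{val}_S(a^2b^3)$, which is obtained in the region $i = 1$.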

The regular subsets of $a^*b^*$ are the finite unions of sets of the form

$$D = \{a^{y + fz}\, b^{w + gx} : f, g \ge 0\},$$

$w, x, y, z \ge 0$. Substituting $y + fz$ and $w + gx$ in place of $p$ and $q$ respectively in $r_i(p, q)$
and $s_i(p, q)$, one sees that $D' = r_S[\alpha\,\mathrm{val}_S(D \cap R_i)]$ is of the form (4) of Lemma 17 below,
the matrix $A$ being

$$\begin{pmatrix} z\beta(i + 1) & -x\beta(\beta - i - 1) \\ -z\beta i & x\beta(\beta - i) \end{pmatrix}.$$

One can apply the lemma to see that $D'$ is regular except if $i = -1$ or $xz = 0$. In these
cases, $D'$ is easily shown to be regular by direct inspection. □
Lemma 17 Let $A$ be a non-singular $p \times p$ integral matrix. For $i = 1, \ldots, p$, set

$$h_i(n) = A_{i1} n_1 + \cdots + A_{ip} n_p - b_i$$

where $n = (n_1, \ldots, n_p) \in \mathbb{N}^p$ and $b_1, \ldots, b_p \in \mathbb{Z}$. If the entries of $\mathrm{dtm}(A)A^{-1}$ are
non-negative, then the language

$$\mathcal{L} = \{a_1^{h_1(n)} \cdots a_p^{h_p(n)} : h_1(n) \ge 0, \ldots, h_p(n) \ge 0,\ n \in \mathbb{N}^p\} \tag{4}$$

is a regular subset of $a_1^* \cdots a_p^*$.

Proof. If $n \in \mathbb{N}^p$ satisfies $h_i(n) \ge 0$ then $(An)_i = b_i + u_i$, i.e.

$$n_i = \sum_{j=1}^{p} (A^{-1})_{ij}(b_j + u_j) \tag{5}$$

for some $u_j \in \mathbb{N}$.

We need to describe those $u = (u_1, \ldots, u_p) \in \mathbb{N}^p$ for which (5) defines non-negative
integers $n_i$.

If $\mathrm{dtm}(A) < 0$, the entries of $A^{-1}$ are non-positive, there are finitely many such $u$ and
$\mathcal{L}$ is finite. If $\mathrm{dtm}(A) > 0$, then $(A^{-1})_{ij} \ge 0$; for large enough $u_j$'s, (5) thus defines positive
numbers but it remains to ensure that they are integers. To that purpose, since
$A^{-1} = \tilde{A}/\mathrm{dtm}(A)$, where the entries of $\tilde{A}$ are natural numbers, it is necessary and
sufficient that the remainders $r_j \in \{0, \ldots, \mathrm{dtm}(A) - 1\}$ of the division of $u_j$ by $\mathrm{dtm}(A)$
satisfy

$$\sum_{j=1}^{p} \tilde{A}_{ij}(b_j + r_j) \equiv 0 \pmod{\mathrm{dtm}(A)}.$$

There is a finite number of such $(r_1, \ldots, r_p)$ so that $\mathcal{L}$ is a finite union of regular languages
of the form

$$\big(a_1^{\mathrm{dtm}(A)}\big)^* a_1^{s_1 \mathrm{dtm}(A) + r_1} \cdots \big(a_p^{\mathrm{dtm}(A)}\big)^* a_p^{s_p \mathrm{dtm}(A) + r_p}.$$

(The $s_j$'s are chosen to guarantee that the $u_j$'s be large enough for the corresponding
$n_i$'s to be non-negative.) □



7 Appendix 

a) The next proposition summarizes the well-known facts that are used in the proof of
Theorem 16. The reader will find in [?], [20] the material necessary to complete its proof.



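Proposition 18 Assume that $\alpha \in \mathbb{N}$ is not a perfect square and that $N > 0$ is a natural
number.

(i) The set of solutions $(X, Y) \in \mathbb{N}^2$ of the equation $X^2 - \alpha Y^2 = N$ is the (finite) union
of the sequences $(X_n, Y_n)_{n \in \mathbb{N}}$ defined by

$$\begin{pmatrix} X_{n+1} \\ Y_{n+1} \end{pmatrix} = \begin{pmatrix} u & \alpha v \\ v & u \end{pmatrix} \begin{pmatrix} X_n \\ Y_n \end{pmatrix}, \quad \forall n \in \mathbb{N}, \tag{6}$$

each sequence starting from one of finitely many initial solutions $(X_0, Y_0)$, where
$(u, v) \in \mathbb{N}^2$ is the minimal non-trivial solution of $U^2 - \alpha V^2 = 1$, i.e. that for
which $u > 1$ is the smallest.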
(ii) Each component of any solution $(X_n, Y_n)_{n \in \mathbb{N}}$ of (6) is a solution of

$$Z_{i+2} = 2uZ_{i+1} - Z_i, \quad \forall i \in \mathbb{N}.$$

In particular, $X_{2n}$, $X_{2n+1}$, $Y_{2n}$ and $Y_{2n+1}$ are of the same parity as $X_0$, $X_1$, $Y_0$ and $Y_1$
respectively.

(iii) For any solution $(X_n, Y_n)_{n \in \mathbb{N}}$ of (6) one has $X_n \ge u^n$. □



b) Taking advantage of Lemmas 13 and 14, we give another proof of Theorem 4, based
on the notion of substitution.

Lemma 19 [?] A subset $X$ of $\mathbb{N}$ is recognizable in base $p$ if and only if the
characteristic sequence of $X$ is generated by a $p$-substitution. □

Lemma 20 [?] The set of the infinite words generated by $p$-substitution is closed under
finite transduction. □



Proof of Theorem 4. We use the notations of Proposition 15. The set $\nu(h(L))$ is a regular
subset of $\{0, \ldots, |\Sigma| - 1\}^*$ and by Lemma 19, the characteristic sequence $\Psi$ of $\pi_{|\Sigma|}[\nu(h(L))]$
is generated by a $|\Sigma|$-substitution. To conclude, use Lemma 20 and observe that the
characteristic sequence of $\pi_{|\Sigma|}[\nu(h(r_S(p + \mathbb{N}q)))]$ is the image of $\Psi$ under the following finite
transducer (the tail has $p$ nodes and the head counts $q$ of them).

Figure 2. The finite transducer for $\pi_{|\Sigma|}[\nu(h(r_S(p + \mathbb{N}q)))]$.

Each state has a loop which corresponds to the reading and the writing of 0. □

c) The nature of the $S$-recognizable sets seems to depend strongly on the system $S$. In
standard numeration systems with integer base, the set of squares is not recognizable
[?] while an example of a system for which it is recognizable may be found in [16], p. 141.
Here is another example, based on Lemma 9.

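Proposition 21 Let $S = (a^*b^* \cup a^*c^*, \{a, b, c\}, a < b < c)$. The set $\{n^2 : n \in \mathbb{N}\}$ is
$S$-recognizable.

Proof. Indeed, since $\#((a^*b^* \cup a^*c^*) \cap \Sigma^n) = 2n + 1$, there are $1 + 3 + \cdots + (2n - 1) = n^2$
words of length less than $n$, so that the first word of length $n$ in
$a^*b^* \cup a^*c^*$ has numerical value $n^2$. □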
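A brute-force enumeration illustrates the counting behind Proposition 21: there are $2n + 1$ words of each length $n$, hence $n^2$ words shorter than $n$, and the first word of length $n$, namely $a^n$, receives the value $n^2$. The sketch assumes the enumeration is shortest words first, then alphabetically with $a < b < c$.

```python
from itertools import product
import re

pattern = re.compile(r"a*b*|a*c*")

# Words of a*b* U a*c*, shortest first, then alphabetically with a < b < c.
words = [w for l in range(7)
         for w in ("".join(t) for t in product("abc", repeat=l))
         if pattern.fullmatch(w)]
val = {w: i for i, w in enumerate(words)}

for n in range(7):
    assert sum(len(w) == n for w in words) == 2 * n + 1   # 2n + 1 words of length n
    assert val["a" * n] == n * n                          # first word of length n

print([val["a" * n] for n in range(5)])  # [0, 1, 4, 9, 16]
```

In other words the set of squares is represented by the regular language $a^*$.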

Using the same idea, one can easily produce various examples of unusual recognizable
sets, such as $\{v_n : n \in \mathbb{N}\}$ for any regular language $L$.